Pruning Regression Trees with MDL

Authors

  • Marko Robnik-Sikonja
  • Igor Kononenko
Abstract

Pruning is a method for reducing the error and complexity of induced trees. Several approaches exist for pruning decision trees, whereas regression trees have attracted less attention. We propose a method for pruning regression trees based on the sound foundations of the MDL principle. We develop coding schemes for various constructs and models in the leaves and empirically test the new method against two well-known pruning algorithms. The results are favourable to the new method.
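
The general idea of MDL-based pruning described in the abstract can be illustrated with a minimal sketch. It is not the authors' coding scheme: the `Node` class, the constants `PRECISION`, `SPLIT_BITS`, and `MEAN_BITS`, and the Elias-gamma-style integer code are all assumptions chosen for illustration. A subtree is collapsed into a leaf whenever encoding the leaf (its mean plus the residuals) takes no more bits than encoding the split and its children.

```python
import math
from dataclasses import dataclass
from typing import List, Optional

# All constants below are illustrative assumptions, not values from the paper.
PRECISION = 0.1   # residuals are coded as integer multiples of this step
SPLIT_BITS = 8.0  # assumed fixed cost of describing one split
MEAN_BITS = 8.0   # assumed fixed cost of describing a leaf's mean


@dataclass
class Node:
    targets: List[float]            # training target values reaching this node
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def int_bits(n: int) -> float:
    """Elias-gamma-style code length for a non-negative integer n."""
    return 2 * math.floor(math.log2(n + 1)) + 1


def leaf_bits(targets: List[float]) -> float:
    """Bits to encode a leaf: node type, mean model, and each residual."""
    mean = sum(targets) / len(targets)
    bits = 1.0 + MEAN_BITS  # 1 bit marks the node as a leaf
    for y in targets:
        steps = round(abs(y - mean) / PRECISION)
        bits += 1.0 + int_bits(steps)  # 1 sign bit plus the magnitude
    return bits


def prune(node: Node) -> float:
    """Prune bottom-up in place; return the code length of the result."""
    if node.is_leaf():
        return leaf_bits(node.targets)
    # 1 bit marks the node as internal, then the split and both children.
    kept = 1.0 + SPLIT_BITS + prune(node.left) + prune(node.right)
    collapsed = leaf_bits(node.targets)
    if collapsed <= kept:
        node.left = node.right = None  # the leaf is the shorter description
        return collapsed
    return kept


# A split that separates two distinct target levels is kept ...
useful = Node([1.0, 1.0, 1.0, 5.0, 5.0, 5.0],
              Node([1.0, 1.0, 1.0]), Node([5.0, 5.0, 5.0]))
prune(useful)

# ... while a split that explains nothing is collapsed into a leaf.
useless = Node([2.0, 2.0, 2.0, 2.0], Node([2.0, 2.0]), Node([2.0, 2.0]))
prune(useless)
```

In this sketch the trade-off is controlled entirely by the assumed constants: a larger `SPLIT_BITS` favours smaller trees, while a finer `PRECISION` makes residuals more expensive and favours keeping splits.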

Similar resources

MDL-Based Decision Tree Pruning

This paper explores the application of the Minimum Description Length principle for pruning decision trees. We present a new algorithm that intuitively captures the primary goal of reducing the misclassification error. An experimental comparison is presented with three other pruning algorithms. The results show that the MDL pruning algorithm achieves good accuracy, small trees, and fast execution ...

Maximum a posteriori pruning on decision trees and its application to bootstrap BUMPing

Cost-complexity pruning generates nested subtrees and selects the best one. However, its computational cost is large since it uses a holdout sample or cross-validation. On the other hand, pruning algorithms based on posterior calculations, such as BIC (MDL) and MEP, are faster, but they sometimes produce trees that are too large or too small, yielding poor generalization error. In this paper, we propose a...

Proper versus Ad-Hoc MDL Principle for Polynomial Regression

The paper deals with the task of polynomial regression, i.e., inducing a polynomial that can be used to predict a chosen dependent variable based on the values of independent ones. As in other induction tasks, there is a trade-off between the complexity of the induced polynomial and its predictive error. One of the approaches for searching for an optimal trade-off is the Minimal Description Length pr...

Learning with data adaptive features

Cost-complexity pruning generates nested subtrees and selects the best one. However, its computational cost is large since it uses a hold-out sample or cross-validation. On the other hand, pruning algorithms based on posterior calculations, such as BIC (MDL) and MEP, are faster, but they sometimes produce trees that are too large or too small, yielding poor generalization error. In this paper, we propose a...

Pruning Decision Trees with Misclassification Costs (appears in ECML-98 as a research note; a longer version is available as ECE TR 98-3, Purdue University)

We describe an experimental study of pruning methods for decision tree classifiers when the goal is minimizing loss rather than error. In addition to two common methods for error minimization, CART's cost-complexity pruning and C4.5's error-based pruning, we study the extension of cost-complexity pruning to loss and one pruning variant based on the Laplace correction. We perform an empirical com...


Publication date: 1998